Space Efficiency in Synopsis Construction Algorithms

Author

  • Sudipto Guha
Abstract

Histograms and wavelet synopses have been found to be useful in query optimization, approximate query answering and mining. Over the last few years several good synopsis algorithms have been proposed. These have mostly focused on the running time of the synopsis constructions, optimum or approximate, vis-a-vis their quality. However, the space complexity of synopsis construction algorithms has not been investigated as thoroughly. Many of the optimum synopsis construction algorithms (as well as a few of the approximate ones) are expensive in space. In this paper, we propose a general technique that reduces space complexity. We show that the notion of “working space” proposed in these contexts is redundant. We believe that our algorithm also generalizes to a broader range of dynamic programs beyond synopsis construction. Our modifications can be easily adapted to existing algorithms. We demonstrate the performance benefits through experiments on real-life and synthetic data.
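To make the space concern concrete, here is a minimal sketch of the textbook V-optimal histogram dynamic program, not the technique proposed in the paper. It assumes squared error with each bucket represented by its mean; the names `vopt_error` and `sse` are hypothetical. Computing only the optimal error needs two rows of the table, O(n) working space, whereas storing the full table to recover the bucket boundaries costs O(nB) space for B buckets.

```python
from itertools import accumulate


def vopt_error(data, buckets):
    """Minimum total squared error using at most `buckets` buckets."""
    n = len(data)
    # Prefix sums of values and of squares give any bucket's error in O(1).
    s = [0.0] + list(accumulate(data))
    q = [0.0] + list(accumulate(v * v for v in data))

    def sse(j, i):
        # Squared error of representing data[j:i] by its mean.
        total = s[i] - s[j]
        return max(0.0, q[i] - q[j] - total * total / (i - j))

    inf = float("inf")
    # prev[i]: best error for data[:i] with the bucket budget used so far.
    prev = [0.0] + [inf] * n
    for _ in range(buckets):
        cur = [0.0] + [inf] * n
        for i in range(1, n + 1):
            cur[i] = min(prev[j] + sse(j, i) for j in range(i))
        prev = cur  # only two rows are ever alive: O(n) working space
    return prev[n]
```

For example, `vopt_error([1, 1, 5, 5], 1)` evaluates to 16.0 and `vopt_error([1, 1, 5, 5], 2)` to 0.0. The sketch returns only the error value; recovering the buckets themselves is where the extra space is normally spent.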


Similar resources

How far will you walk to find your shortcut: Space Efficient Synopsis Construction Algorithms

In this paper we consider the wavelet synopsis construction problem without the restriction that we only choose a subset of coefficients of the original data. We provide the first near optimal algorithm. We arrive at the above algorithm by considering space efficient algorithms for the restricted version of the problem. In this context we improve previous algorithms by almost a linear factor an...


Offline and Data Stream Algorithms for Efficient Computation of Synopsis Structures

Synopsis and small space representations are important data analysis tools and have long been used in OLAP/DSS systems, approximate query answering, query optimization and data mining. These techniques represent the input in terms of broader characteristics and improve efficiency of various applications, e.g., learning, classification, event detection, among many others. In a recent past, the synopsi...


A Survey of Synopsis Construction in Data Streams

The large volume of data streams poses unique space and time constraints on the computation process. Many query processing, database operations, and mining algorithms require efficient execution which can be difficult to achieve with a fast data stream. In many cases, it may be acceptable to generate approximate solutions for such problems. In recent years a number of synopsis structures have b...


Optimality and Scalability in Lattice Histogram Construction

The Lattice Histogram is a recently proposed data summarization technique that achieves approximation quality preferable to that of an optimal plain histogram. Like other hierarchical synopsis methods, a lattice histogram (LH) aims to approximate data using a hierarchical structure. Still, this structure is not defined a priori; it constitutes an unknown, not a given, of the problem. Past work has...


Efficient Haar+ Synopsis Construction for the Maximum Absolute Error Measure

Several wavelet synopsis construction algorithms were previously proposed based on dynamic programming for unrestricted Haar wavelet synopses as well as Haar synopses. However, they find an optimal synopsis for every incoming value in each node of a coefficient tree, even if different incoming values share an identical optimal synopsis. To alleviate the limitation, we present novel algorithms, whi...




Publication date: 2005